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We propose a new method for the determination of the weight factor for the simulated temper- 
ing method. In this method a short replica-exchange simulation is performed and the simulated 
tempering weight factor is obtained by the multiple-histogram reweighting techniques. The new 
algorithm is particularly useful for studying frustrated systems with rough energy landscape where 
the determination of the simulated tempering weight factor by the usual iterative process becomes 
^^ , very difficult. The effectiveness of the method is illustrated by taking an example for protein folding. 
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I. INTRODUCTION 

In complex systems such as spin glasses and biopolymers, it is very difficult to obtain accurate canonical distributions 
at low temperatures by conventional Monte Carlo (MC) and molecular dynamics (MD) simulation methods. This is 
because simulations at low temperatures tend to get trapped in one of huge number of local- minimum-energy states. 
One way to overcome this multiple-minima problem is to perform a simulation in a generalized ensemble where each 
state is weighted by a non-Boltzmann probability weight factor so that a random walk in potential energy space 
may be realized. (For a review of generalized-ensemble approach in the protein folding problem, see, e.g., Ref. |I|.) 
The random walk allows the simulation to escape from any energy barrier and to sample much wider phase space 
than by conventional methods. Monitoring the energy in a single simulation run, one can obtain not only the global- 
minimum-energy state but also canonical ensemble averages as functions of temperature by the single-histogram H 
and/or multiple-histogram [g[Q] reweighting techniques. 

Two of the most well-known generalized-ensemble algorithms are perhaps multicanonical algorithm (MUCA) lq| 
and simulated tempering (ST) Jo,[7| (the latter method is also referred to as the method of expanded ensemble [|6[). 
(For reviews, see, e.g., Refs. BM). A simulation in MUCA performs a free ID random walk in potential energy space. 
This method and its generalizations have already been used in many applications in protein systems (see, for instance. 



Refs. 10 1 - |20f|). A simulation in ST performs a free ID random walk in temperature space, which in turn induces a 
random walk in potential energy space and allows the simulation to escape from states of energy local minima again. 
ST has also been applied to protein folding problem |2l]-|2^. 

The generalized-ensemble algorithms are powerful, but in the above two methods the probability weight factors 
are not a priori known and have to be determined by iterations of short trial simulations 1 5HW| . This process can be 
non-trivial and very tedious for complex swtems with many local-minimum-energy states. 

The replica-exchange method (REM) [p5| , p6[ alleviates this difficulty. (A similar method was independently de- 
veloped earlier in Ref. [ p7[ . REM is also referred to as multiple Markov chain method pSf and parallel tempering 
y|.) In this method, a number of non-interacting copies of the original system (or replicas) at different temperatures 
are simulated independently and simultaneously by the conventional MC or MD methods. Every few steps, pairs of 
replicas are exchanged with a specified transition probability. The weight factor is just the product of Boltzmann 
factors, and so it is essentially known. We have developed a molecular dynamics algorithm for REM |29] (see, also, 
Ref. Q ) . We then developed a multidimensional REM |^ . 

However, REM also has a computational difficulty: As the number of degrees of freedom of the system increases, 
the required number of replicas also increases, and so does the required computation time. This is why we want to 
combine the merits of MUCA and ST and those of REM so that we can determine the weight factors for MUCA 
and ST with ease and save the computation time greatly. We have presented such an example; we developed a new 
method for the determination of the multicanonical weight factor p2| , |33| . The method was referred to as the replica- 
exchange multicanonical algorithm (REMUCA). In REMUCA, a short replica-exchange simulation is performed, and 
the multicanonical weight factor is determined by the multiple- histogram reweighting techniques y,^. 

In this Letter we present another example of such a combination. In the new method, which we refer to as the 
replica-exchange simulated tempering (REST), a short replica-exchange simulation is performed, and the simulated 
tempering weight factor is determined by the multiple-histogram reweighting techniques PM]. The effectiveness of 
the method is tested with a penta peptide, Met-enkephalin, in gas phase. 

II. METHODS 

We first briefly review the original simulated tempering (ST) method P,n|. In this method temperature itself 
becomes a dynamical variable, and both the configuration and the temperature are updated during the simulation 
with a weight: 

H^sT(i^;r) = e-'^^+'^(^), (1) 

where /3 — l/ksT {ks is the Boltzmann constant), E is the potential energy, and the function a{T) is chosen so that 
the probability distribution of temperature is given by 

-Pst(T) = I dE n{E) Wst{E;T) = j dE n{E) g-'^^+'^m = const , (2) 

where n{E) is the density of states. Hence, in ST the temperature is sampled uniformly. A free random walk 
in temperature space is realized, which in turn induces a random walk in potential energy space and allows the 
simulation to escape from states of energy local minima. 



In the numerical work we discretize the temperature in M different values, Tm (to = 1, • • • , M); we can order the 
temperature so that Ti < T2 < • ■ ■ < Tm ■ The lowest temperature Ti should be sufhciently low so that the simulation 
can explore the global-minimum-energy region, and the highest temperature Tm should be sufficiently high so that 
no trapping in a local- minimum-energy state occurs. The probability weight factor in Eq. (|l|) is then given by 

W^ST(i?;r™)=e-^'"^+"- , (3) 

where a™ = a{Tm) (rn = 1,- • -jM). The parameters a,„ are not a priori known and have to be determined by 
iterations of short simulations (see, e.g., Refs. P,ElU23] for details). This process can be non-trivial and very difficult 
for complex systems. Note that from Eqs. (0) and (bT we have 



f dE n{E) e-'^"'-^ . (4) 



The parameters am are therefore "dimensionless" Helmholtz free energy at temperature T„i (i.e., the inverse temper- 
ature Pm multiplied by the Helmholtz free energy). 

Once the simulated tempering parameters a™ are determined and the initial configuration and the jnitial temper- 
ature Tm are chosen, a ST simulation is realized by alternately performing the following two steps 



1. A canonical MC or MD simulation at the fixed temperature Tm is carried out for a certain MC or MD steps. 

2. The temperature Tm is updated to the neighboring values Tm±i with the configuration fixed. The transition 
probability of this temperature- updating process is given by the Metropohs criterion (see Eq. (||)): 

frr rr ^ ( I , for A < , ,^s 

w{Tm - T,„±i) ^ I ^^p ^_^^ , for A > , ^^^ 

where 

A = {Pm±l - f3m) E - {am±l - am) ■ (6) 

In the present work, we employ Monte Carlo algorithm for Step 1. After a long ST production run, one can obtain 
canonical ensemble averages as functions of temperature by the multiple-histogram rewcighting techniques [^Jj] as 
described in detail below. 

The replica- exchange method (REM) was developed as an extension of simulated tempering |25|| (thus it is also 
referred to as parallel tempering H) (see, e.g., Ref. [p9| for a detailed description of the algorithm). The system for 
REM consists of M non-interacting copies (or, replicas) of the original system in the canonical ensemble at M different 

temperatures T,„ (to, — 1, • • • , M). Let X = \- ■ ■ , Xm, ■ ■ ■> stand for a state in this generalized ensemble. Here, the 

superscript i and the subscript m in Xm label the replica and the temperature, respectively. The state X is specified 
by the M sets of coordinates g''' (and momenta p'-^') of the atoms in replica i at temperature Tm {i,m — 1,- ■ ■ , M): 

xH^|gW,p[n . (7) 

In Monte Carlo algorithm we need to take into account only gl*' . 

A REM simulation is then realized by alternately performing the following two steps |^ - |^ . 

1. Each replica in canonical ensemble of the fixed temperature is simulated simultaneously and independently for 
a certain MC or MD steps. 

2. A pair of replicas, say i and j, which are at neighboring temperatures, Tm and Tm±i^ respectively, are exchanged: 
X = \---,Xm,---, x^j.1 ' ' ' ' f — * -^' ^ I ' ' ' ■^™ ' ' ' ' ' -^miti ' ' ' ' f ■ '^h^ transition probability of this replica- 
exchange process is given by the Metropolis criterion: 

iv v'\ f ^ ^ for A < , , . 

^(^^^) = \exp(-A) , forA>0, (^) 

where 

A = {Pm±l - Pm) (e (gW) - E (ql-'l)) . (9) 

Here, E (g^) and E (g'-'l) are the potential energy of the i-th replica and the j-th replica, respectively. 



In the present work we employ Monte Carlo algorithm for Step 1. A random walk in temperature space is realized for 
each replica, which in turn induces a random walk in potential energy space. This alleviates the problem of getting 
trapped in states of energy local minima. 

The major advantage of REM over other generalized-ensemble methods such as MUCA [g and ST [g|,0| lies in 
the fact that the weight factor is essentially a priori known (which is just a product of Boltzmann factors), whereas 
in the latter algorithms the determination of the weight factors can be very tedious and time-consuming. However, 
the number of required replicas (or temperatures) for REM increases like ^fN where N is the number of degrees 
of freedom of the system ||25| . We will need a huge number of replicas and a large amount of computation time to 
simulate a complex system such as a real protein system, while in MUCA or ST only one replica (the original system 
itself) is simulated. This led us to combine the merits of REM and ST (the combination of REM and MUCA was 
also realized p^ , |33[ |). 

We finally present the new method which we refer to as the replica-exchange simulated tempering (REST). In this 
method we first perform a short REM simulation (with M replicas) to determine the simulated tempering weight factor 
and then perform with this weight factor a regular ST simulation with high statistics. The first step is accomplished 
by the multiple- histogram reweighting techniques y,^ as follows. Let Nm{E) and Um be respectively the potential- 
energy histogram and the total number of samples obtained at temperature T^ = l/fcs/3m of the REM run. The 
density of states, n{E), is then given by ^,y 



<E) = ^P^ , (10) 



E 



"-m iJm ^ 



where 



g-/™ ^^^(^)g-/3™B ^ (11) 

E 

Here, g^ = 1-1- 2Tm, and r^ is the integrated autocorrelation time at temperature T,„. Note that Eqs. (O) and (O) 
are solved self-consistently by iteration to obtain the density of states n{E) and the dimensionless Helmholtz free 
energy f^. 

Once the estimate of the dimensionless Helmholtz free energy f„i are obtained, the simulated tempering weight 
factor can be directly determined by using Eq. ^ where we set a™ — fm (compare Eqs. (^) and (|ll|)). A long ST 
run is then performed with this weight factor. Let Nm{E) and nm be respectively the potential-energy histogram 
and the total number of samples obtained at temperature T,„ = 1/kBPm from this ST run. The multiple-histogram 
reweighting techniques of Eqs. ( |lO| ) and ([l|) can be used again to obtain the best estimate of the density of states 
n(E). The expectation value of a physical quantity A at any temperature T (= l/ksP) is then calculated from 



Y,A{E)n{E)e 



-/3E 



<yl>T = -^^= ■ (12) 



E"(^) 



g-/3S 



III. RESULTS AND DISCUSSION 

We tested the effectiveness of the new algorithm for the system of a penta-peptide, Met-enkephalin, in gas phase. 
This peptide has the amino-acid sequence, Tyr-Gly-Gly-Phe-Met. The backbone was terminated by a neutral NH2- 
group and a neutral -COOH group at the N-terminus and at the C-terminus, respectively, as in the previous MC 
works of Met-enkephalin. The potential energy function Etot (in kcal/mol) is given by the sum of the electrostatic 
term Eq, 12-6 Lennard- Jones term i^Lj, and hydrogen-bond term Ehb for all pairs of atoms in the molecule together 
with the torsion term i^TOR for all torsion angles. The parameters in the energy function as well as the molecular 
geometry were taken from ECEPP/2 [M. The computer code KONF90 p5[ was used, and MC simulations based on 
the replica-exchange simulated tempering (REST) were performed. The dielectric constant e was set equal to 2 as in 
the previous works. The peptide-bond dihedral angles uj were fixed at the value 180° for simplicity. With this choice 
of parameters our results can be directly compared with Refs. ]10| , [l7| . 



The remaining dihedral angles constitute the variables to be updated in the MC simulations: (pk and V'fe in the 
main chain {k = 1, • • • ,5) and Xk ™^ ^^^ ^^^^ chains (there are 3 x's for Tyr, 2 x's for Phe, and 4 x's for Met). 
For Met-enkephalin the number of degrees of freedom is thus 19. One MC sweep consists of updating all these 19 
angles once with Metropolis evaluation for each update. The simulations (of all replicas) were started from randomly 
generated conformations. 

In Table I we summarize the parameters of the simulations that were performed in the present work. As described 
in the previous section, REST consists of two simulations: a short REM simulation (from which the dimcnsionless 
Helmholtz free energy, or the simulated tempering weight factor, is determined) and a subsequent ST production 
run. The former simulation is referred to as REMl and the latter as STl in Table I. In REMl there exist 8 replicas 
with 8 different temperatures (M = 8), ranging from 50 K to 1000 K as listed in Table I (i.e., Ti = 50 K and 
Tj^i = Tg = 1000 K). The same set of temperatures were also used in STl. The temperatures were distributed 
exponentially between Ti and Tm, following the optimal distribution found in the previous simulated annealing 
schedule [p5i, simulated tempering run p^, and replica-exchange simulation [p9|. Before taking the data in REMl, 
we made a REM simulation of 10,000 MC sweeps for thermalization. We then performed a REM simulation of 5 x lO'' 
MC sweeps to obtain the weight factor for simulated tempering. After estimating the weight factor, we made a ST 
production run of 10^ MC sweeps (STl), which followed additional 1,000 MC sweeps for equilibration. In REMl and 
STl, a replica exchange and a temperature update, respectively, were tried every 10 MC sweeps. Data were collected 
at each MC sweep in both REMl and STl. 

We first check whether the replica-exchange simulation of REMl indeed performed properly. For an optimal 
performance of REM the acceptance ratios of replica exchange should be sufficiently uniform and large (say, > 10 %). 
In Table II we list these quantities. It is clear that both points are met in the sense that they are of the same order 
(the values vary between 10 % and 40 %). Moreover, in Fig. 1 the canonical probability distributions obtained at the 
chosen 8 temperatures from REMl are shown. We see that there are enough overlaps between all pairs of neighboring 
distributions, indicating that there will be sufficient numbers of replica exchange between pairs of replicas (see Table 
II). We did observe random walks in potential energy space between low energies and high energies. 

After REMl, we obtained the dimcnsionless Helmholtz free energy at the eight temperatures, /,„ (to = 1, • • • , 8), 
by the multiple- histogram reweighting techniques 0.4] (namely, by solving Eqs. (QO) and (O) self-consistently). This 
gives the simulated tempering weight factor of Eq. (2) with a.^ = fm- We remark that for biomolecular systems the 
integrated autocorrelation times, r™, in the reweighting formulae can safely be set to be a constant M, and we do so 
throughout the analyses in this section. 

After determining the simulated tempering weight factor, we carried out a long ST simulation for data collection 
(STl in Table I). In Fig. 2 the time series of temperature and potential energy from STl are plotted. In Fig. 2(a) 
we observe a random walk in the temperature space between the lowest and highest temperatures. In Fig. 2(b) the 
corresponding random walk of the total potential energy between low and high energies is observed. Note that there is 
a strong correlation between the behaviors in Figs. 2(a) and 2(b), as there should. It is known from our previous works 
that the global-minimum-energy conformation for Met-enkephalin has the potential energy value of —12.2 kcal/mol 
|10| , |l7| . Hence, the random walk in Fig. 2(b) indeed visited the global-minimum region many times. It also visited 
high energy regions, judging from the fact that the average potential energy is around 15 kcal/mol at T = 1000 K 
0,0 (see Fig. 4 below). 

For an optimal performance of ST, the acceptance ratios of temperature update should be sufficiently uniform and 
large. In Tabic III we list these quantities. It is clear that both points are met (the values vary between 26 % and 
57 %); we find that the present ST run (STl) indeed properly performed. We remark that the acceptance ratios in 
Table HI are significantly larger and more uniform than those in Table II, suggesting that ST runs can sample the 
configurational space more effectively than REM runs, provided the optimal weight factor is obtained. 

In the previous works of ST simulations of Met-enkephalin in gas phase [2^,^^ , at least several iterations of trial 
simulations were required for the simulated tempering weight determination. We emphasize that in the present case 
of REST (REMl), only one simulation was necessary to determine the optimal simulated tempering weight factor 
that can cover the energy region corresponding to temperatures between 50 K and 1000 K. 

In Fig. 3 we plot the simulated tempering parameters fm ("t- — 1, • • • , 8) and the dimcnsionless Helmholtz free 
energy (inverse temperature multiplied by Helmholtz free energy) /(T) as a function of temperature T. The former 
quantities, /„j, were estimated by the multiple-histogram reweighting techniques, using the results of the first REM 
run (REMl). The latter quantity, f{T), was calculated by the multiple-histogram reweighting techniques, using the 
results of the final ST production run (STl). Namely, the density of states, n{E), was obtained by solving Eqs. (M) 
and { ^A\ j by iteration. The function /(T) was then calculated by taking a summation of n{E) exp{—(3E) over E for 
each value of T (i.e., replace fm and Pm in Eq. ( pi] ) by f{T) and /3, respectively). The results agree very well with 
each other, implying that the simulated tempering parameters, /,„, that were obtained from a short REM run are 
already quite accurate. 

To check the validity of the canonical-ensemble expectation values calculated by the new method, in Fig. 4 we 



compare the average potential energy as a function of temperature obtained from STl with that obtained from the 
previous MUCA simulation ||l^. The results for STl were obtained by the multiple- histogram method JSJQ] (see 
Eqs. (pi)|), ^n\), and (p^), whereas the single-histogram method Q was used for the results of the MUCA simulation. 
We can see a perfect coincidence of the average values in Fig. 4. 

IV. CONCLUSIONS 

In this Letter we have proposed a new algorithm for configurational sampling of frustrated systems with rough 
energy landscape. In this method, which we refer to as replica-exchange simulated tempering (REST), the simulated 
tempering weight factor is determined from the results of a short replica-exchange simulation, and then a regular 
simulated tempering production run is made with this weight. The formulation of the method is simple and straight- 
forward, but the numerical improvement is great, because the weight factor determination for simulated tempering 
becomes very difficult by the usual iterative process for complex systems. The new method was tested with the 
system of a small peptide in gas phase. The simulated tempering weight factor was indeed obtained by a single, short 
replica-exchange simulation. 
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TABLE I. Summary of parameters in REST simulations. 



Run 


No. of replicas 


Temperature, T™ (K) 


MC sweeps 


REMl 
STl 


8 
1 


50, 77, 118, 181, 277, 425, 652, 1000 
50, 77, 118, 181, 277, 425, 652, 1000 


5 X 10"* 
1 X 10« 



TABLE II. Acceptance ratios of replica exchange in REMl. 



Pair of temperatures (K) Acceptance ratio 

50 < > 77 030 

77 < — > 118 0.27 

118 < — > 181 0.22 

181 < — > 277 0.17 

277 < — > 425 0.10 

425 < — > 652 0.27 

652 < — > 1000 0.40 



TABLE III. Acceptance ratios of temperature update in STl. 



Pair of temperatures (K) Acceptance ratio 

047 
0.47 
0.43 
0.43 
0.37 
0.42 
0.29 
0.29 
0.30 
0.26 
0.43 
0.42 
0.57 
0.56 
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Figure Captions 

Fig. 1. Probability distribution of potential energy obtained at the eight temperatures from REMl (see Table I 
for the parameters of the simulation) . The left-most distribution corresponds to the lowest temperature (Ti — 50 
K), and the right-most to the highest one (Tg = 1000 K). 

Fig. 2. Time series of temperature (a) and potential energy (b) in STl (see Table I for the parameters of the 
simulation) . 

Fig. 3. Dimensionless Helmholtz free energy as a function of temperature T (K). The dotted curve is the 
result from the simulated tempering production run (STl). The crosses are the /„ (m = 1, • • • ,8) that were 
determined from the short preliminary replica-exchange run (REMl). Both results were normalized so that the 
values at T = 50 K agree with each other. 

Fig. 4. The average potential energy as a function of temperature. The solid curve was obtained by the multiple- 
histogram reweighting techniques from the results of STl. The crosses were obtained by the single-histogram 
reweighting techniques from the results of the previous multicanonical MC simulation ||l7| . 
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